Managing application interactions using distributed modality components

ABSTRACT

A method for managing multimodal interactions can include the step of registering a multitude of modality components with a modality component server, wherein each modality component handles an interface modality for an application. The modality component can be connected to a device. A user interaction can be conveyed from the device to the modality component for processing. Results from the user interaction can be placed on a shared memory area of the modality component server.

CROSS REFERENCE TO RELATED APPLICATIONS

This Application is a continuation of U.S. application Ser. No. 12/135,651, entitled “MANAGING APPLICATION INTERACTIONS USING DISTRIBUTED MODALITY COMPONENTS” filed on Jun. 9, 2008, which is a continuation of U.S. application Ser. No. 10/741,499, entitled “MANAGING APPLICATION INTERACTIONS USING DISTRIBUTED MODALITY COMPONENTS” filed on Dec. 19, 2003, now U.S. Pat. No. 7,401,337, each of which is incorporated by reference herein in its entirety.

BACKGROUND

1. Field of the Invention

The present invention relates to the field of computer software and, more particularly, to multimodal applications.

2. Description of the Related Art

A multimodal application is an application that permits user interactions with more than one input mode. Examples of input modes include speech, digital pen (handwriting recognition), and the graphical user interface (GUI). A multimodal application may, for example, accept and process speech input as well as keyboard or mouse input. Similarly, a multimodal application may provide speech output as well as visual output, which can be displayed upon a screen. Multimodal applications can be particularly useful for small computing devices possessing a form-factor that makes keyboard data entry more difficult than speech data entry. Further, environmental conditions can cause one interface modality available in a multimodal application to be preferred over another. For example, if an environment is noisy, keypad and/or handwritten input can be preferred to speech input. Further, when visual conditions of an environment, such as darkness or excessive glare, make a screen associated with a computing device difficult to read, speech output can be preferred to visual output.

Although users of small computing devices can greatly benefit from multimodal capabilities, small computing devices can be resource constrained. That is, the memory and processing power available to a small computing device can be too limited to support the local execution of more than one mode of interaction at a time. To overcome resource constraints, multimodal processing can be distributed across one or more remote computing devices. For example, if one mode of interaction is speech, speech recognition and synthesis processing for the speech mode can be performed upon a speech-processing server that is communicatively linked to the multimodal computing device. Software developers face a significant challenge in managing distributed multimodal interactions, some of which can be executed locally upon a computing device, while other interactions can be executed remotely.

Conventional solutions to distributed multimodal interaction management have typically been application-specific solutions that have been designed into an application during the application's software development cycle. Accordingly, the features available for each modality, such as speech recognition features, are typically tightly integrated within the software solution so that future enhancements and additional features can require extensive software rewrites. Because hardware and software capabilities are constantly evolving in the field of information technology, customized solutions can rapidly become outdated and can be costly to implement. A more flexible, application-independent solution is needed.

SUMMARY OF THE INVENTION

The present invention provides a method, a system, and an apparatus for managing a multimodal application using a set of activation conditions and data placed on a shared memory area, where one set of application conditions is defined by an application developer. More specifically, a multitude of modular modality components can be provided, each of which can perform tasks for a particular modality. Each input modality supported by the application can be handled by a modality component. Particular ones of the modality components and/or portions thereof can be local to the multimodal application, while others can be remotely located from the application. The multimodal application can communicate with a resource constrained thin client capable of locally executing a limited subset of available modalities at any one time.

The multimodal application can selectively utilize registered modality components in a dynamic manner as needed. For example, a speech recognition component can be fired and connected directly to the thin client to perform text output and input recognition tasks. The results of the speech input recognition are placed on the shared memory area.

It should be noted that each modality component can place its own set of activation conditions and data on a shared memory area of the modality component server. The set of activation conditions submitted by a modality component defines how it can be activated, and how input and output between the modality component and the client device can be started and stopped. A special modality component, called the application module, can be used to add and remove authored activation conditions. The application module can also be activated based on an occurrence of one of the application conditions that was in turn initiated by an application event. The activation conditions defined for the application in combination with the state of objects in shared memory can be used to select different modality components as needed to perform input recognition and output synthesis, as well as to interpret data submitted by multiple modality components for complex multimodal interactions.
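
By way of illustration only, the result objects and activation conditions described above can be pictured as simple records. The following Python sketch is not part of the disclosure; the class and field names (SharedObject, ActivationCondition, source, payload, and so on) are assumptions made solely to make the description concrete.

```python
# Illustrative data model only; the class and field names are assumptions,
# not structures required by the disclosure.
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SharedObject:
    """A result object a modality component places on the shared memory area."""
    source: str                 # e.g. "speech", "handwriting", "gui"
    payload: dict               # interpretation of the user interaction
    confidence: float = 1.0     # recognition confidence level
    timestamp: float = field(default_factory=time.time)


@dataclass
class ActivationCondition:
    """A rule a modality component (or the application module) submits."""
    owner: str                                        # submitting component
    predicate: Callable[[List[SharedObject]], bool]   # test of shared memory state
    action: Callable[[List[SharedObject]], None]      # fired when the predicate matches
```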

One aspect of the present invention can include a method for managing multimodal interactions. The method can include the step of registering a multitude of modality components, wherein each modality component can handle an interface modality for an application. In one embodiment, one or more of the modality components can be remotely located from a device. Further, the device can lack the resources to locally execute at least one function that is handled by the remotely located modality component. The device can also contain one or more locally disposed modality components. Once the multimodal application has been instantiated and modality components registered, a registered modality component can be activated and connected to the device. Once connected to the device, a user interacts with the device and the connected modality. The results of the interaction are placed on the shared memory area of the modality component server.

In one embodiment, a list of activation conditions can be established for each modality component. Appropriate modality components can be utilized whenever one of the listed activation conditions is detected. In another embodiment, the method can detect a condition that indicates that one of the registered components is required. Upon detecting this condition, the registered modality component can be used to perform a programmatic action.

Another aspect of the present invention can include a modality component server that includes a modality activator, a multimodal engine, and/or a modality interface. The modality activator can manage proxies residing on the server for each of the modality components. The modality activator can also dynamically disconnect the modality component from the device and deactivate the modality component responsive to a completion of an interaction response. The modality interface can standardize data exchanged between modality components and the multimodal application.

The multimodal engine includes an inference engine, a list of activation conditions, and a shared memory area. The inference engine matches the activation conditions to the current state of the shared memory area. An activation condition that matches may activate an appropriate modality component, or activate the application module as part of a complex multimodal interaction. A multimodal interaction may involve more than one modality component. The multimodal engine can detect an interaction specified by a modality component and can responsively initiate an interaction response, which can also be specified by the modality component. The multimodal engine can detect interactions defined by any modality component and initiate appropriate actions upon detecting the interactions. The multimodal engine can manage multimodal interactions involving multiple modality components.

In another embodiment, the application can be accessed remotely from a thin client. The thin client can lack sufficient resources to simultaneously enable a set of interface modalities supported by the multimodal application. That is, substantial resources are not consumed on the thin client because the modality overhead is handled remotely by the modality component server and modality components. The thin client can utilize any of the modalities supported by the multimodal application by having the modality component server activate the modalities that the device supports.

In another embodiment, the multimodal engine can manage complex multimodal interactions that involve more than one multimodal component. Each multimodal component can place the results of its interaction with the user on the multimodal engine's shared memory area. Each result can be placed on the shared memory area as an object that can contain various properties, such as timestamps and confidence levels. The inference engine can run a list of activation conditions against a current list of objects in the shared memory area. One or more activation conditions that match the current state of the shared memory area can be selected. An activation condition submitted by the application module may be one of those selected.

The application module can resolve a multimodal interaction involving multiple modality components. The application's activation condition can resolve the references to missing data in one of the modality component's submitted results, where the missing information can be contained in the results submitted by another modality component. For example, the activation condition may be, “if speech object has ‘here’ in it, and if a digital pen gesture object is added to the shared memory area within five seconds, activate the application module.”
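
Expressed as code, the example activation condition above might resemble the following sketch, which reuses the hypothetical SharedObject record from the earlier sketch; the helper name and field names are assumptions.

```python
# Hypothetical rendering of the example activation condition; assumes the
# SharedObject record sketched earlier.
def here_plus_gesture(objects):
    """True if a speech result containing 'here' is followed within five
    seconds by a digital pen gesture result."""
    speech = [o for o in objects
              if o.source == "speech" and "here" in o.payload.get("text", "")]
    gestures = [o for o in objects if o.source == "handwriting"]
    return any(0 <= g.timestamp - s.timestamp <= 5.0
               for s in speech for g in gestures)
```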

BRIEF DESCRIPTION OF THE DRAWINGS

There are shown in the drawings, embodiments that are presently preferred; it being understood, however, that the invention is not limited to the precise arrangements and instrumentalities shown.

FIG. 1 is a schematic diagram illustrating a system for handling application modalities in a modular fashion in accordance with the inventive arrangements disclosed herein.

FIG. 2 is a schematic diagram illustrating a system for a multimodal application that manages distributed modality components in accordance with the inventive arrangements disclosed herein.

FIG. 3 is a flow chart illustrating a method for managing multimodal interactions in accordance with the inventive arrangements disclosed herein.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 is a schematic diagram illustrating a system 100 for handling application modalities using dialog states of the application in accordance with the inventive arrangements disclosed herein. The system 100 can include a modality component server 102, a modality service server 150, a thin client 130, and a multimodal application server 112.

The thin client 130 can be a computing device of limited capabilities that can rely upon one or more backend servers to perform at least a portion of its processing tasks. The thin client 130 can be, for example, a personal data assistant (PDA), a cellular telephone, an electronic book, an electronic contact manager, a data tablet, and any other such computing device. The thin client 130 can lack sufficient resources to simultaneously enable a set of interface modalities supported by a multimodal application 105 operating upon the thin client 130. The thin client 130 can report the input modes that it supports to a modality component dedicated to the configuration of the device running the application. Information supplied by the thin client 130 can then be submitted to the modality component server 102. Those modality components that support the input modes supported by the device can be dynamically activated and deactivated by the modality component server 102 as needed.
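
For illustration only, the device configuration report mentioned above might be a small structure listing the modes the thin client supports, which the server could use to pick suitable modality components; the field names below are assumptions rather than a defined format.

```python
# Hypothetical device configuration report submitted on behalf of the thin
# client; the field names are illustrative only.
device_configuration = {
    "device_id": "pda-0001",
    "supported_input_modes": ["gui-keypad", "stylus", "speech"],
    "supported_output_modes": ["visual-display", "synthetic-speech"],
    "display": {"width_px": 240, "height_px": 320},
}


def select_components(registered_components, configuration):
    """Keep only registered modality components whose modality the device supports."""
    modes = set(configuration["supported_input_modes"]
                + configuration["supported_output_modes"])
    return [c for c in registered_components if c["modality"] in modes]
```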

Each modality can represent a particular input or output methodology. For example, graphical user interface (GUI) based modalities can include, but are not limited to, a keyboard input modality, a mouse selection modality, a screen touch modality, a visual display modality, and the like. Speech based modalities can include a speech input modality, a synthetic speech generation modality, a Dual Tone Multiple Frequency (DTMF) modality, and the like.

The multimodal application 105 can be a software application, which supports interactions via more than one modality. In one embodiment, the multimodal application 105 can be a stand-alone, local application residing upon the thin client 130. In another embodiment, the multimodal application 105 can be a client-side module that interacts with the multimodal application server 112. The multimodal application server 112 can be a remotely located application that manages a portion of the processing tasks relating to the multimodal application 105. The multimodal application server 112 will typically be used in situations where resources directly provided by the thin client 130 are very limited compared to the computing resources required for application operations.

The modality component server 102 can manage a plurality of modalities that have been modularly implemented so that the functions and features for a particular modality can be contained within a specifically designed modality component. When a modality component is activated, the component can interact directly with the device for input recognition and output synthesis. Different modality components used by the modality component server 102 can be executed to interact with the device from different locations. For example, a modality component can be locally executed with respect to the modality component server 102, a modality component or portions thereof can be remotely executed upon the modality service server 150, and/or a modality component can be executed upon the thin client 130. The modality component server 102 can coordinate actions and events relating to the multimodal application 105, regardless of where the modality component is located.

The modality component server 102 can include a set of active modalities 110 and a set of available modalities 115 that are not presently active. The modality component server 102 can process events and match the activation conditions of the modality components with the current state of a shared memory area. As a result of the matching, modality components are activated. One particular modality component is the application module. The application module utilizes modality component objects placed in a shared memory area to respond to user events.

The modality service server 150 can assign modality processing tasks to a modality component from a remote location. The modality service server 150 can convey data 155 across a network 145 to the modality component server 102 and/or the multimodal application 105. In one embodiment, the modality service server 150 can provide a Web service for a specified modality component. The Web service can be, for example, a natural language comprehension service, a text-to-speech service, a language translation service, and the like.

In another embodiment, the modality service server 150 can include a multitude of functions available through remote procedure call (RPC) routines. It should be appreciated that the data 155 provided by the modality service server 150 can be conveyed in any of a variety of manners and the invention is not to be limited in this regard. For example, message queuing and advanced program-to-program communications (APPC) can be used to convey the data 155 to the multimodal application 105 and/or the modality component server 102. The interaction data transferred between the modality component and the device can also be encoded into a compression format.
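
As a purely illustrative example of such a conveyance, a remotely hosted text-to-speech modality service might be reached through an ordinary HTTP call; the endpoint, message shape, and function name below are assumptions, not an interface defined by the disclosure.

```python
# Hypothetical call to a remotely hosted text-to-speech modality service; the
# URL and JSON message shape are assumptions made for illustration.
import json
import urllib.request


def synthesize_speech(text, service_url="http://modality-service.example/tts"):
    request = urllib.request.Request(
        service_url,
        data=json.dumps({"text": text, "voice": "default"}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()   # e.g. audio data, possibly in a compressed format
```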

In one embodiment, the modality component server 102 can include an interface 120 used to standardize data conveyances. The interface 120 can define rules, data formats, parameters, and the like for complying with the architecture of the multimodal application 105 and/or the modality component server 102. Any of a variety of routines, libraries, data adaptors, networking mechanisms, and the like can be included within the interface 120 to facilitate the exchange of data.

For example, in one embodiment, the interface 120 can include an application program interface (API) defined for the multimodal application 105 and/or the modality component server 102. In another embodiment, the interface 120 can convert responses received from the modality service server 150 from a format native to the modality service server 150 to a format compatible with the multimodal application 105. In yet another embodiment, the interface 120 can include a plurality of protocol adaptors to establish network communications with the modality service server 150.

In operation, a multitude of modality components provided from different locations can register with the modality component server 102, thereby becoming available 115 modality components. One of these modality components can include an application module. When the modality component is registered, details for the modality component, including links for activating a modality component and firing modality component routines, can be specified. Registered modality components can include an application module provided by the multimodal application server 112 and a device configuration module provided by the thin client 130. In one embodiment, the resource requirements specified within the application module and the resources available as specified through the device configuration module can be used by the modality component server 102 when selecting which available 115 modality components are to become active 110.
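
The registration details mentioned above, including links for activating a component and firing its routines, could be captured in a record of the following kind; the keys and values are illustrative assumptions only.

```python
# Illustrative registration record for a modality component; the keys mirror
# the description above but are assumptions, not a defined format.
speech_component_registration = {
    "component_id": "speech-modality-01",
    "modality": "speech",
    "activation_link": "rpc://speech-host/activate",     # link used to activate it
    "routines": {                                        # entry points the server may fire
        "start_input_recognition": "rpc://speech-host/startRecognition",
        "stop_input_recognition": "rpc://speech-host/stopRecognition",
        "synthesize_output": "rpc://speech-host/synthesize",
    },
    "resource_requirements": {"memory_mb": 64},
}

registry = {}


def register(record):
    """Add a modality component to the server's list of available components."""
    registry[record["component_id"]] = record
```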

After modality components have been registered with the modality component server 102, the thin client 130 can instantiate the multimodal application 105. A multitude of available 115 modality components can become active 110 components for the application instance. An active 110 modality component is one having a multitude of software objects enabled, where each software object controls one or more modality tasks, as well as communication to the device to directly handle user interaction. Modality software objects can be placed in a shared memory area or “white board” of the modality component server 102. The different software objects within the shared memory area can be used to coordinate application interactions between the modality components.

For example, an initial modality component, such as a GUI modality component, can be activated based upon an initial dialogue state of the multimodal application 105. When activated within the modality component server 102, the GUI modality can be added to the active modalities 110 and events specified for the GUI modality component can be monitored. During the lifetime of the application, one or more GUI software objects provided by the GUI modality component can be added to the shared memory area. Data necessary to execute GUI modality functions for the multimodal application 105 can then be enabled upon the thin client 130. Enabling these GUI modality functions can involve adding software objects for the application to the shared memory area.

Input and output data relating to user interactions are transferred directly between the thin client 130 and the modality component 110. Each of these user interactions may have results that can be compared against activation conditions. The activation conditions are run by the inference engine after one of the various software objects of the various modality components is enabled within the shared memory area. The activation conditions that fire programmatic actions can be dynamically adjusted as different software objects are placed within the shared memory area.

FIG. 2 is a schematic diagram illustrating a system 200 for a multimodal application that manages distributed modality components in accordance with the inventive arrangements disclosed herein. The system 200 can include a thin client 230, at least one modality component 235, and a modality component server 205. The thin client 230 can possess the structural characteristics and functions ascribed to the thin client 130 of FIG. 1. In one embodiment, device specific information concerning the thin client 230 can be conveyed to the modality component dedicated to the device's configuration. This modality component in turn can convey the configuration data to the modality component server 205 so that suitable device specific parameters can be established and behavior of the modality component server 205 adjusted in a device specific manner.

The modality component 235 can be a modular software unit that handles interactions relating to a particular modality for a multimodal application executed upon the thin client 230. The modality component 235 can include, but is not limited to, a speech component, a handwriting component, a DTMF component, a keypad entry component, a GUI component, and the like. Collaborations that can exist between different modality components 235 are handled by the modality component server 205. For example, a speech component can perform a speech recognition task resulting in a speech input being converted into textual output. The textual output can be displayed within a GUI text element, which is displayed using features of a GUI component. An application module, which is itself a modality component, can also be provided.

Each of the modality components 235 can be registered with the modality component server 205. Registration can provide the modality component server 205 with information necessary to dynamically activate the modality components 235 as needed. The modality component 235 can be local to the modality component server 205 and/or thin client 230, or the modality component 235 can be remotely located.

The modality component server 205 can be a software application, which coordinates interactions of a multimodal application running upon a resource restricted thin client 230. The modality component server 205 can include an interface 225, a multimodal engine 210, and a modality activator 220.

In one embodiment, the interface 225 can possess the structural characteristics and functions ascribed to the interface 120 of FIG. 1. The interface 225 can be used to facilitate the conveyance of interaction data between the modality component 235 and the modality component server 205.

The modality activator 220 can be used to dynamically activate and/or deactivate the modality components 235 as appropriate. For example, the modality activator 220 can be a listener within an event/listener pattern that can trigger operations of modality components based on matches occurring within the inference engine 215. That is, the modality activator 220 can initiate one or more proxy clients that manage operations of registered modality components.

The multimodal engine 210 can include an inference engine 215, a shared memory area 217, and a rule data store 219. The shared memory area 217 can be a common memory space in which modality objects 250, 252, 254, and 256 are placed. Each of the modality objects can represent a software object provided by a specific modality component 235. Different activation conditions can be loaded into/removed from the multimodal engine 210 by the application module in accordance with the dialogue state of the application as specified by modality objects enabled within the shared memory area 217. When a modality component 235 is deactivated, the modality objects associated with the modality component 235 can be removed from the shared memory area 217.
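
A minimal sketch of the shared memory area, or "white board", is shown below; the class and method names are assumptions chosen only to mirror the behavior described above (placing objects, and removing them when the owning component is deactivated).

```python
# Minimal sketch of the shared memory area ("white board"); the class and
# method names are assumptions, not the disclosed implementation.
class SharedMemoryArea:
    def __init__(self):
        self._objects = []          # modality objects currently enabled

    def place(self, obj):
        """A modality component places a result object on the white board."""
        self._objects.append(obj)

    def remove_for(self, source):
        """Drop the objects owned by a component that has been deactivated."""
        self._objects = [o for o in self._objects if o.source != source]

    def state(self):
        """Current list of objects the inference engine matches against."""
        return list(self._objects)
```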

A multitude of activation conditions specified within the activation condition data store 219 can cause operations of modality objects that have been placed in the shared memory area 217 to be executed. Interaction events can trigger the firing of modality operations associated with specified activation conditions. These operations include, for example, text-to-speech output and enabling input recognition for a speech modality component.

The inference engine 215 runs application conditions in the activation condition data store 219 in response to detection of application events. An application event can be an assertion of new data, such as an interpretation of user input, by a modality component 235. For example, an application event can be an on-focus event or a mouse-click event resulting from a user interaction within a particular modality. The application event can also be a system event. For example, a system event can be triggered whenever the resources available to the modality component server 205 fall below a designated threshold. The modality component server 205 can update and modify the events data contained within the inference engine 215 in accordance with the modality objects enabled within the shared memory area 217.

All active modality components 235 assert a modality event when they update the shared memory area 217. In response to the modality event, the inference engine 215 runs the activation conditions stored in the activation condition data store 219 against the current state of the shared memory area 217. Those activation conditions that match the current state of the shared memory area are fired. The appropriate responses to the events are thereby determined by the multimodal engine 210. That is, the multimodal engine 210 can detect the occurrence of events specified by active modality components and the appropriate responses for these events can be specified by the inference engine 215. The responses determined by the inference engine 215 can sometimes result in the activation of a previously deactivated modality component and the execution of one or more methods provided by the newly activated modality component.
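
The inference cycle described above might be sketched as follows, reusing the hypothetical ActivationCondition and SharedMemoryArea records from the earlier sketches; the ensure_active call on the modality activator is likewise an assumed name.

```python
# Sketch of the inference cycle run on each modality event; reuses the
# hypothetical ActivationCondition and SharedMemoryArea records above, and
# assumes the modality activator exposes an ensure_active() call.
def on_modality_event(shared_memory, activation_conditions, modality_activator):
    """Run every stored activation condition against the current state of the
    shared memory area and fire the conditions that match."""
    state = shared_memory.state()
    for condition in activation_conditions:
        if condition.predicate(state):
            # A fired condition may require a previously inactive component,
            # so the modality activator is asked to activate it first.
            modality_activator.ensure_active(condition.owner)
            condition.action(state)
```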

One illustrative example showing the system 200 in operation involves a multimodal auto travel application deployed in the thin client 230, such as a personal data assistant (PDA). In the example, a series of modality components 235, including an application modality component provided by an application server, can be registered with the modality component server 205. Registration can include establishing a set of application conditions for triggering operations of objects that each modality component 235 can place in the shared memory area 217.

A communication channel can be established between the thin client 230 and the modality component server 205. A user interaction can satisfy one or more activation conditions resulting in an operation of a modality being executed. A communication connection can be established between the modality component and the modality component server 205. The modality component server 205 can coordinate the interactions relating to the auto travel application.

For example, a geographical map can be presented within the PDA and a user can use a digital pen to circle an area of the map. Additionally, a user can speak a phrase, such as “find all restaurants located here.” Data representing the digital pen action can be first conveyed to the handwriting modality component 235. The handwriting modality component interprets the digital pen interaction and responsively places a software object 250 within the shared memory area 217. The software object 250 can define an area of the map that the user circled.

Data representing the speech input can also be conveyed to a speech modality component 235. This speech input can be submitted by the application module as an object placed within the shared memory area. The speech modality component is activated after the inference engine 215 runs the activation conditions against the current state of the shared memory area 217.

An activation condition contained within the activation condition data store 219 can be specified to resolve a multimodal interaction involving multiple modality components. The activation condition is triggered when a speech input is received within a designated time of the receipt of a digital pen input. The firing of this activation condition can result in the placement of software object 254 into the shared memory area 217, where software object 254 can be an object placed as a result of determining the context of spoken pronouns and other such phrases.

For example, the activation condition can associate the word “here” with the area defined by the digital pen. The result of firing the activation event is to locate restaurants within the circle. Once located, the application can place software object 256 in the shared memory area 217. The software object 256 can annotate restaurant locations on appropriate areas of a graphical map. Once the map has been constructed, the GUI modality component can be used to convey the map to the thin client 230. The resulting map can visually depict the location of all restaurants within the previously circled area of the map.
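
For illustration, the auto travel interaction can be traced with the earlier sketches as a sequence of objects placed on the shared memory area; the payload contents and helper names are assumptions.

```python
# Illustrative trace of the auto travel interaction using the earlier
# sketches; the payload contents are assumptions.
shared_memory = SharedMemoryArea()

# 1. The handwriting component interprets the pen gesture (software object 250).
shared_memory.place(SharedObject(source="handwriting",
                                 payload={"region": "circled-map-area"}))

# 2. The speech component interprets the utterance (software object 252).
shared_memory.place(SharedObject(source="speech",
                                 payload={"text": "find all restaurants located here"}))

# 3. The application's activation condition matches, so the application module
#    resolves "here" to the circled region and places the resolved query
#    (software objects 254 and 256) for the GUI component to render.
if here_plus_gesture(shared_memory.state()):
    shared_memory.place(SharedObject(source="application",
                                     payload={"query": "restaurants",
                                              "region": "circled-map-area"}))
```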

FIG. 3 is a flow chart illustrating a method 300 for managing multimodal interactions in accordance with the inventive arrangements disclosed herein. The method 300 can be performed in the context of a modular multimodal application that manages distributed modality components. The method can begin in step 305, where a multitude of modality components can be registered. Each modality component can handle a particular interface modality for the application. Different modality components can be provided by different suppliers for each modality. In step 306, an application module can be registered.

In step 307, the application can be initiated from a thin client. The application submits its activation conditions to the multimodal engine. In step 308, the thin client submits its device configuration, including the modality interfaces it supports. In step 309, the modality activator activates the registered modality components supported by the device, and connects each modality component directly to the device. In step 310, for the life of the application, results of user interactions are placed on the shared memory area by the modality components. For instance, when a speech input is received and interpreted by a speech input modality component, the component places its interpretation on the shared memory area. In step 311, the inference engine runs against the list of activation conditions and the current state of the shared memory area. In step 312, one or more activation conditions that match against the current state of the memory area are selected. The selected activation conditions determine the operations the modality components are to perform.

In step 313, the application module adds and/or removes activation conditions from the list of activation conditions stored within the activation condition data store. Each submission and deletion of an activation condition also counts as an event that causes the inference engine to run against the new list of activation conditions. In step 314, the modality component performs a programmatic action as a result of its activation condition being selected. The programmatic operation may be, for example, the output of text by the speech output modality component directly to the device. Once the programmatic action has been performed, the method can loop to step 310, where the results of the programmatic action can be placed on the multimodal engine's shared memory area.
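
Tying the steps together, the flow of FIG. 3 might be sketched as the following loop, reusing names from the earlier sketches; the control structure is illustrative only and is not the claimed method.

```python
# End-to-end sketch of the flow of FIG. 3, reusing names from the earlier
# sketches (register, registry, select_components, on_modality_event); the
# control structure is illustrative only, not the claimed method.
def run_application(components, device_configuration, interaction_results,
                    shared_memory, activation_conditions, modality_activator):
    # Steps 305-306: register the modality components and the application module.
    for component in components:
        register(component)
    # Steps 307-309: activate the registered components the device supports and
    # connect them to the device (the connection itself is omitted here).
    for component in select_components(list(registry.values()),
                                       device_configuration):
        modality_activator.ensure_active(component["component_id"])
    # Steps 310-314: each interaction result is placed on the shared memory
    # area and the inference cycle is run against the activation conditions.
    for result in interaction_results:
        shared_memory.place(result)
        on_modality_event(shared_memory, activation_conditions,
                          modality_activator)
```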

It should be noted that method 300 represents one of a number of possible arrangements consistent with the invention as disclosed. The invention is not to be construed as limited to the exact details specified in the method 300, and many variations of the illustrated method that are consistent with the disclosed inventive arrangements can be used by one of ordinary skill to achieve equivalent results. For example, the activation conditions for each modality component can be placed on the activation list of the modality component server whenever the modality components are registered and not as specifically detailed in FIG. 3.

The present invention can be realized in hardware, software, or a combination of hardware and software. The present invention can be realized in a centralized fashion in one computer system or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware and software can be a general-purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.

The present invention also can be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods. Computer program in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.

This invention can be embodied in other forms without departing from the spirit or essential attributes thereof. Accordingly, reference should be made to the following claims, rather than to the foregoing specification, as indicating the scope of the invention.

CLAIMS

1. A method for managing distributed multimodal interactions comprising: registering a plurality of distributed modality components with a modality component server, wherein each modality component handles an interface modality for one or more applications, wherein each modality component places a set of activation conditions in a shared memory area of the modality component server, wherein the set of activation conditions defines how the modality component is activated and how input and output between the modality component and a client device is started and stopped, wherein activation conditions are added and/or removed by an application module; receiving, from a first multimodal application, activation conditions for at least one of the modality components that the first multimodal application supports; matching the activation conditions submitted by the first multimodal application with activation conditions stored in the shared memory area by an inference engine of the modality component server; activating a modality component, by a modality activator of the modality component server, when the set of activation conditions for said modality component is satisfied by the activation conditions submitted by the first multimodal application; connecting said activated modality component to a client device on which the first multimodal application is executing; and disconnecting the activated modality component from the client device and deactivating the modality component by the modality activator upon completion of an interaction response.

2. A system comprising: a plurality of modality components comprising at least a first modality component and a second modality component, wherein the first modality component handles interactions relating to a first modality and wherein the second modality component handles interactions relating to a second modality that is different from the first modality, wherein each of the first and second modality components is implemented in a device-independent way that supports client devices of different types; and a modality component server that is accessible to a plurality of client devices, wherein each client device comprises a multimodal application executing thereon, wherein the modality component server provides access for the plurality of client devices to the plurality of modality components.

3. The system of claim 2, wherein each of the first and second modality components is implemented in an application-independent manner that supports multimodal applications of different types.

4. The system of claim 2, wherein the first modality component provides a speech based modality.

5. The system of claim 2, wherein the modality component server comprises an interface that converts data conveyances from a format compatible with the first modality component to formats compatible with the multimodal applications executing on the plurality of client devices.

6. The system of claim 2, wherein the first modality component provides a speech input modality.

7. The system of claim 2, wherein the first modality component provides a synthetic speech generation modality.

8. The system of claim 2, wherein the plurality of client devices comprise a plurality of cellular telephones, and wherein the modality component server comprises an interface that is compatible with the plurality of cellular telephones.

9. The system of claim 2, in combination with at least one of the plurality of client devices.

10. The system of claim 3, wherein the first modality component provides a speech based modality.

11. The system of claim 3, wherein the first modality component provides a speech input modality.

12. The system of claim 11, wherein the plurality of client devices comprise a plurality of cellular telephones, and wherein the modality component server comprises an interface that is compatible with the plurality of cellular telephones.

13. A method for managing distributed multimodal interactions comprising: providing a plurality of client devices with access to a plurality of modality components comprising at least a first modality component and a second modality component, wherein the first modality component handles interactions relating to a first modality and wherein the second modality component handles interactions relating to a second modality that is different from the first modality, wherein each of the first and second modality components is implemented in a device-independent way that supports client devices of different types.

14. The method of claim 13, wherein each of the first and second modality components is implemented in an application-independent manner that supports multimodal applications of different types executing on the plurality of client devices.

15. The method of claim 14, wherein the first modality component provides a speech based modality.

16. The method of claim 13, wherein the first modality component provides a speech input modality.

17. The method of claim 13, wherein the first modality component provides a synthetic speech generation modality.

18. The method of claim 13, wherein the plurality of client devices comprise a plurality of cellular telephones.

19. The method of claim 13, wherein the modality component server comprises an interface that converts data conveyances from a format compatible with the first modality component to formats compatible with the multimodal applications executing on the plurality of client devices.

20. The method of claim 13, wherein the first modality component provides a speech based modality.